An Experiment on Target Tracking Via Image Segmentation

1.  Introduction 

In surveillance and tracking using imagery sensors, targets
must be detected and tracked at various signal-to-noise ratios
for a fixed or varying target size. Although the scenes considered
may be quite simple, consisting of one or multiple targets
in a noisy background, the computational algorithms must also be
simple and effective to meet real-time or near-real-time
requirements. Our previous study [1] has demonstrated such
capability by using Fisher's linear discriminant to segment
infrared images. Recently, a Bayes classifier was proposed
[2] for pixel classification to segment a scene containing a box with
different Gaussian statistics from the noisy background. In
that approach, object (target) parameters such as size, location,
etc., calculated from projections, are used to iteratively improve
the decision rule. A cost function is used to derive the final
segmentation. An entirely different approach [3] is to use a
semi-causal recursive filter to enhance the image so
that the target can be detected and tracked.

In this paper, the approach taken in Ref. 1 is applied to
the same example as in Ref. 2 for a comparative study.
Furthermore, the experiment is extended to a dynamic simulation
of a moving object by varying the size of the target sequentially.
A comparison is made with the theoretical performance. The
effect of the learning sample size on detection performance
is also examined.

2.  Algorithm for Static Scenes

The first part of the algorithm is for a static scene as
considered in Ref. 2. The full picture is 32x32 pixels,
with an 11x11 target box in the center of the
scene. Extensive experimental study shows that the feature vector
consisting of two components, the gray level of the pixel and
the average gray level of its 3x3 neighborhood, performs the best.
This feature vector is then used throughout the target tracking
study. For the learning samples, 100 pixels from the target
region (class 1) and 100 pixels from the background (class 2) are
selected. The target region has a Gaussian distribution with
mean 10 and variance 2, while the background region has a Gaussian
distribution with mean 8 and variance 2. Fisher's linear
discriminant is used for pixel classification to segment the image
into target region and background region. The experimental
probability of pixel misclassification can then be compared with
the theoretical value given by [1]

$$P_e = \frac{1}{\sqrt{2\pi}} \int_{u/2}^{\infty} e^{-y^2/2} \, dy$$

where u is the "norm", which in this case is the Mahalanobis
distance between the two classes based on the pooled scatter
matrix. Fig. 1 shows that the experimental and theoretical errors are in
reasonable agreement as a function of u. The larger the
error, the more difficult the target detection.
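
As a rough numerical check on the theoretical error, the following Python sketch evaluates the standard normal tail at half the Mahalanobis distance, which is the usual error expression for two equally likely Gaussian classes sharing one covariance matrix. It is an illustration only: the function names are hypothetical, and the single-feature distance used in the example (gray level alone) is an assumption made for simplicity, not the two-component norm of the experiment.

```python
import math


def gaussian_tail(x):
    """Upper-tail probability of a standard normal variable, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def theoretical_error(u):
    """Error probability for two equal-prior, equal-covariance Gaussian
    classes separated by Mahalanobis distance u (assumed model)."""
    return gaussian_tail(u / 2.0)


# Illustrative case: gray level alone, class means 10 and 8, variance 2.
u_gray = abs(10.0 - 8.0) / math.sqrt(2.0)
print(f"u = {u_gray:.3f}, theoretical error = {theoretical_error(u_gray):.4f}")
```

Because averaging over the 3x3 neighborhood reduces the noise variance, the two-component feature vector should give a larger u, and hence a smaller error, than this gray-level-only value.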


Fig. 2a is the original artificial 32x32 image as generated
and displayed on the AED-512 terminal in 256 gray levels. Fig. 2b
shows the Fisher's linear discriminant result. Ten errors are
detected in the target, which is slightly better than the result
reported in Ref. 2.
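
The static-scene experiment can be sketched in a few lines of Python. The code below is a minimal illustration under stated assumptions, not the program used in this study: the function names, the random seed, the random choice of the 100 learning pixels per class, the clipping of 3x3 neighborhoods at the image border, and the midpoint threshold on the projected class means are all choices made for the example. Its error rate is taken over the whole picture and will differ from the figures quoted above.

```python
import random


def make_scene(rng, size=32, box=11, mu_target=10.0, mu_back=8.0, var=2.0):
    """Gaussian background with a centered box x box Gaussian target."""
    lo = (size - box) // 2
    sd = var ** 0.5
    mask = [[lo <= r < lo + box and lo <= c < lo + box for c in range(size)]
            for r in range(size)]
    image = [[rng.gauss(mu_target if mask[r][c] else mu_back, sd)
              for c in range(size)] for r in range(size)]
    return image, mask


def pixel_features(image):
    """Map (row, col) -> (gray level, mean gray level of 3x3 neighborhood).
    Neighborhoods are clipped at the image border (an assumption)."""
    n = len(image)
    feats = {}
    for r in range(n):
        for c in range(n):
            block = [image[i][j]
                     for i in range(max(r - 1, 0), min(r + 2, n))
                     for j in range(max(c - 1, 0), min(c + 2, n))]
            feats[(r, c)] = (image[r][c], sum(block) / len(block))
    return feats


def class_stats(samples):
    """Mean vector and scatter (sums of squared deviations) of 2-D samples."""
    k = len(samples)
    mx = sum(s[0] for s in samples) / k
    my = sum(s[1] for s in samples) / k
    sxx = sum((s[0] - mx) ** 2 for s in samples)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples)
    syy = sum((s[1] - my) ** 2 for s in samples)
    return (mx, my), (sxx, sxy, syy)


def fisher_train(target_samples, back_samples):
    """Fisher direction w = Sw^-1 (m1 - m2) and a midpoint threshold."""
    m1, s1 = class_stats(target_samples)
    m2, s2 = class_stats(back_samples)
    sxx, sxy, syy = (a + b for a, b in zip(s1, s2))  # pooled scatter
    det = sxx * syy - sxy * sxy
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    threshold = 0.5 * (w[0] * (m1[0] + m2[0]) + w[1] * (m1[1] + m2[1]))
    return w, threshold


def run_static_experiment(seed=0, n_learn=100):
    """Train on n_learn pixels per class; return the whole-image error rate."""
    rng = random.Random(seed)
    image, mask = make_scene(rng)
    feats = pixel_features(image)
    target_px = [p for p in feats if mask[p[0]][p[1]]]
    back_px = [p for p in feats if not mask[p[0]][p[1]]]
    w, t = fisher_train([feats[p] for p in rng.sample(target_px, n_learn)],
                        [feats[p] for p in rng.sample(back_px, n_learn)])
    errors = sum((w[0] * f[0] + w[1] * f[1] > t) != mask[p[0]][p[1]]
                 for p, f in feats.items())
    return errors / len(feats)


print(f"pixel error rate: {run_static_experiment():.3f}")
```

A pixel is labeled as target when its projection onto w exceeds the threshold; since w is built from the class 1 minus class 2 mean difference, the target class always projects to the higher side.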

3.  Algorithm for the Time-Varying Images

The main part of the algorithm is used to segment the time-varying
images. We select four 32x32 pictures with target
sizes 10x10, 8x8, 6x6 and 4x4, respectively. The results are as
follows.


(1)  The picture with 10x10 target (Fig. 3)

     learning sample size    error percentage
     10x10                   4.00  (Fig. 3b)
     8x8                     4.88  (Fig. 3c)
     6x6                     7.81  (Fig. 3d)
     4x4                     9.67  (Fig. 3e)

(2)  The picture with 8x8 target (Fig. 4)

     learning sample size    error percentage
     10x10                   3.91  (Fig. 4b)
     8x8                     4.79  (Fig. 4c)
     6x6                     8.10  (Fig. 4d)
     4x4                     9.96  (Fig. 4e)

(3)  The picture with 6x6 target (Fig. 5)

     learning sample size    error percentage
     10x10                   3.71  (Fig. 5b)
     8x8                     4.69  (Fig. 5c)
     6x6                     8.01  (Fig. 5d)
     4x4                     10.06  (Fig. 5e)

(4)  The picture with 4x4 target (Fig. 6)

     learning sample size    error percentage
     10x10                   2.83  (Fig. 6b)
     8x8                     3.71  (Fig. 6c)
     6x6                     7.32  (Fig. 6d)
     4x4                     9.38  (Fig. 6e)
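
The tabulated percentages can be related to pixel counts by simple arithmetic. Assuming each percentage is the number of misclassified pixels divided by the 32x32 = 1024 pixels of the picture (an assumption, since the denominator is not stated), the short Python check below recovers an implied whole-pixel count for every entry. The names are illustrative.

```python
TOTAL_PIXELS = 32 * 32  # assumed denominator for the error percentage

# Reported error percentages keyed by target size; each list is ordered by
# learning sample size 10x10, 8x8, 6x6, 4x4.
reported = {
    "10x10": [4.00, 4.88, 7.81, 9.67],
    "8x8": [3.91, 4.79, 8.10, 9.96],
    "6x6": [3.71, 4.69, 8.01, 10.06],
    "4x4": [2.83, 3.71, 7.32, 9.38],
}


def implied_count(percent):
    """Nearest whole number of misclassified pixels for a percentage."""
    return round(percent * TOTAL_PIXELS / 100.0)


def residual(percent):
    """Gap between the reported and the implied whole-pixel percentage."""
    return abs(percent - 100.0 * implied_count(percent) / TOTAL_PIXELS)


for target, percents in reported.items():
    print(target, [implied_count(p) for p in percents])
```

Under this assumption the entries correspond to between 29 and 103 misclassified pixels, and every entry agrees with a whole-pixel count to within 0.006 percentage points.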

From the above results, it is concluded that, for a given
target size, better detection is obtained with a larger learning
sample size. Fig. 7 plots the empirical relationship between
the percentage error and the learning sample size. We also notice
that even when the target is small, as when it first appears in the scene,
good detection is possible by using a large learning sample.
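
The dependence on learning sample size can be explored with a small simulation. The Python sketch below is only an illustration under assumptions that go beyond the text: learning samples are taken from a separate simulated frame whose target box has the stated learning size (all of its target pixels plus an equal number of randomly chosen background pixels), 3x3 neighborhoods are clipped at the border, and the threshold is the midpoint of the projected class means. All function names are hypothetical, and the printed values come from a single random run, so they will not reproduce the percentages tabulated above.

```python
import random


def frame(rng, box, size=32, mu_target=10.0, mu_back=8.0, var=2.0):
    """Simulate one frame; return [((gray, mean3x3), is_target), ...]."""
    lo, sd = (size - box) // 2, var ** 0.5

    def inside(r, c):
        return lo <= r < lo + box and lo <= c < lo + box

    img = [[rng.gauss(mu_target if inside(r, c) else mu_back, sd)
            for c in range(size)] for r in range(size)]
    pixels = []
    for r in range(size):
        for c in range(size):
            nb = [img[i][j]
                  for i in range(max(r - 1, 0), min(r + 2, size))
                  for j in range(max(c - 1, 0), min(c + 2, size))]
            pixels.append(((img[r][c], sum(nb) / len(nb)), inside(r, c)))
    return pixels


def fisher_rule(class1, class2):
    """Fit Fisher's linear discriminant; return f(x), True for class 1."""
    def stats(xs):
        k = len(xs)
        mx, my = sum(x for x, _ in xs) / k, sum(y for _, y in xs) / k
        return (mx, my,
                sum((x - mx) ** 2 for x, _ in xs),
                sum((x - mx) * (y - my) for x, y in xs),
                sum((y - my) ** 2 for _, y in xs))

    m1x, m1y, a, b, c = stats(class1)
    m2x, m2y, d, e, f = stats(class2)
    sxx, sxy, syy = a + d, b + e, c + f  # pooled within-class scatter
    det = sxx * syy - sxy * sxy
    dx, dy = m1x - m2x, m1y - m2y
    wx, wy = (syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det
    t = 0.5 * (wx * (m1x + m2x) + wy * (m1y + m2y))  # midpoint threshold
    return lambda x: wx * x[0] + wy * x[1] > t


def error_percentage(rng, target_box, learn_box):
    """Train on a frame with a learn_box target, test on a target_box frame."""
    train = frame(rng, learn_box)
    target = [x for x, is_t in train if is_t]
    back = rng.sample([x for x, is_t in train if not is_t], len(target))
    rule = fisher_rule(target, back)
    test = frame(rng, target_box)
    return 100.0 * sum(rule(x) != is_t for x, is_t in test) / len(test)


rng = random.Random(0)
for target_box in (10, 8, 6, 4):
    row = [error_percentage(rng, target_box, n) for n in (10, 8, 6, 4)]
    print(f"{target_box}x{target_box} target:",
          " ".join(f"{e:5.2f}" for e in row))
```

Averaging such runs over many random seeds would be needed before comparing the trend with Fig. 7.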

Fig. 1  Experimental and theoretical errors.

Fig. 2  (a) Original artificial picture with target (11x11) box of N(10,2) and background of N(8,2). (b) Result of using Fisher's linear discriminant, with 10 errors in the target box.

Fig. 3  (c) Detection result using 64 learning samples. (d) Detection result using 36 learning samples. (e) Detection result using 16 learning samples.

Fig. 4  (a) The 32x32 picture with 8x8 object box. (b) Detection result using 100 learning samples. (e) Detection result using 16 learning samples.

Fig. 6  (a) The 32x32 picture with 4x4 object box. (b) Detection result using 100 learning samples. (c) Detection result using 64 learning samples. (d) Detection result using 36 learning samples.

Fig. 7  Percentage error versus learning sample size for various target sizes.